A Linear Classifier Outperforms UCT in 9x9 Go

Author

  • N. Sylvester
Abstract

The dominant paradigm in computer Go is Monte-Carlo Tree Search (MCTS). This technique chooses a move by playing a series of simulated games, building a search tree along the way. After many simulated games, the most promising move is played. This paper proposes replacing the search tree with a neural network. Where previous neural network Go research has used the state of the board as input, our network uses the last two moves. In experiments exploring the effects of various parameters, our network outperforms a generic MCTS player that uses the Upper Confidence bounds applied to Trees (UCT) algorithm. A simple linear classifier performs even better.
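For readers unfamiliar with the UCT baseline mentioned above, the selection rule it applies at each tree node (UCB1) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `ucb1_select`, the `(wins, visits)` representation, and the exploration constant of 1.4 are assumptions chosen for the example.

```python
import math


def ucb1_select(children, exploration=1.4):
    """Return the index of the child move with the highest UCB1 score.

    children: list of (wins, visits) pairs, one per candidate move
              (hypothetical representation, for illustration only).
    exploration: weight of the exploration term (assumed value).
    """
    total_visits = sum(visits for _, visits in children)
    best_index, best_score = None, float("-inf")
    for index, (wins, visits) in enumerate(children):
        if visits == 0:
            # Try every move once before comparing scores.
            return index
        mean_reward = wins / visits
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        score = mean_reward + bonus
        if score > best_score:
            best_index, best_score = index, score
    return best_index
```

A move that has been simulated only a few times receives a large bonus, so the search keeps revisiting it until its win rate is estimated well enough to rule it in or out.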


Similar resources

Grid Coevolution for Adaptive Simulations: Application to the Building of Opening Books in the Game of Go

This paper presents a successful application of parallel (grid) coevolution to the building of an opening book (OB) in 9x9 Go. Known sayings about the game of Go are rediscovered by the algorithm, and the resulting program was also able to comment credibly on openings in professional games of 9x9 Go. Interestingly, beyond the application to the game of Go, our algorithm can be seen as a "meta"...


LRTDP Versus UCT for Online Probabilistic Planning

UCT, the premier method for solving games such as Go, is also becoming the dominant algorithm for probabilistic planning. Out of the five solvers at the International Probabilistic Planning Competition (IPPC) 2011, four were based on the UCT algorithm. However, while a UCT-based planner, PROST, won the contest, an LRTDP-based system, GLUTTON, came in a close second, outperforming other systems ...


Every Team Deserves a Second Chance: An Interactive 9x9 Go Experience (Demonstration)

We show that without using any domain knowledge, we can predict the final performance of a team of voting agents, at any step towards solving a complex problem. This demo allows users to interact with our system, and observe its predictions, while playing 9x9 Go.


Every Team Deserves a Second Chance: An Interactive 9x9 Go Experience

We show that without using any domain knowledge, we can predict the final performance of a team of voting agents, at any step towards solving a complex problem. This demo allows users to interact with our system, and observe its predictions, while playing 9x9 Go.


Achieving Master Level Play in 9 × 9 Computer Go

The UCT algorithm uses Monte-Carlo simulation to estimate the value of states in a search tree from the current state. However, the first time a state is encountered, UCT has no knowledge, and is unable to generalise from previous experience. We describe two extensions that address these weaknesses. Our first algorithm, heuristic UCT, incorporates prior knowledge in the form of a value function...




Publication date: 2011